Commodore BASIC

Commodore BASIC, also known as PET BASIC, is the dialect of the BASIC programming language used in Commodore International's 8-bit home computer line, stretching from the PET of 1977 to the C128 of 1985. The core was based on 6502 Microsoft BASIC, and as such it shares most of the core code with other 6502 BASICs of the time, such as Applesoft BASIC.

History

Commodore licensed BASIC from Microsoft on a "pay once, no royalties" basis for US$25,000 (sources give figures ranging from $10,000 to $30,000). Bill Gates first offered it at a royalty of $3 per unit, but Jack Tramiel turned the offer down, stating "I'm already married", and said he would pay no more than $25,000 for a perpetual license. Gates later came back and accepted the deal.[1] Commodore took the source code of the flat-fee BASIC and developed it further internally for all of its other 8-bit home computers. It was not until the Commodore 128 (with V7.0) that a Microsoft copyright notice was displayed. However, Microsoft had built an easter egg into the version 2 or "upgrade" Commodore BASIC that proved its provenance: typing the (obscure) command WAIT 6502, 1 would result in Microsoft! appearing on the screen. (The easter egg was well concealed: the message did not show up in any disassembly of the interpreter.)[2]

Technical details

A convenient feature of Commodore's ROM-resident BASIC interpreter and KERNAL was the full-screen editor, which allowed users to enter direct commands or to input and edit program lines from anywhere on the screen—simply by pressing the RETURN key whenever the cursor happened to be on a line containing a valid BASIC statement. This marked a significant change in program entry interfaces compared to other common home computer BASICs at the time, which typically used line editors, invoked by a separate EDIT command, a "copy cursor," Escape sequences, or the like.

It also had the capability of saving named files to any device, including the cassette, a popular storage device in the days of the PET. Most systems of the era only supported filenames on diskette, which made saving multiple files on other devices more difficult. Users of those systems had to note the recorder's counter reading at the location of each file, a method that was inaccurate and prone to error. Many non-Commodore users worked around the problem by recording only one file per tape. With the PET, when the user asked to load a file by name from the cassette, the device would read data sequentially, ignoring any non-matching filenames until the named file was reached and read into memory. The file system was also supported by a powerful record structure that could be loaded from or saved to files. Another difference between the cassette implementations of Commodore and other systems was that Commodore tapes were encoded digitally, whereas other manufacturers used a less expensive analog interface, which enabled the use of a standard tape recorder but was much less reliable.
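The load-by-name behaviour described above amounts to a linear scan of the tape. The following Python sketch is only an illustrative model: the tape is represented as a list of (name, data) pairs, and the function name load_by_name is hypothetical rather than anything found in the Commodore ROMs.

```python
def load_by_name(tape, wanted):
    """Scan a tape front to back and return the first file named `wanted`.

    `tape` is a list of (filename, data) pairs standing in for files
    recorded one after another on a cassette (an assumption of this sketch).
    Returns the file data and the names that were skipped on the way.
    """
    skipped = []
    for name, data in tape:
        if name == wanted:
            return data, skipped
        skipped.append(name)  # non-matching file: ignore it and keep reading
    raise LookupError(f"{wanted!r} not found before end of tape")


tape = [("INTRO", b"\x01"), ("GAME", b"\x02\x03"), ("SCORES", b"\x04")]
data, skipped = load_by_name(tape, "GAME")
print(data, skipped)
```

Because the scan is sequential, files recorded later on the tape take longer to reach, but the user never has to track counter positions by hand.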

Like the original Microsoft BASIC interpreter on which it is based, Commodore BASIC is slower than native machine code. Test results have shown that copying 16 kilobytes from ROM to RAM takes less than a second in machine code, but over a minute in BASIC. Programmers therefore used various techniques to speed up execution. One was to store often-used integer values in variables rather than using literal values, as interpreting a variable name was faster than interpreting a literal number. When speed was important, some programmers converted sections of BASIC programs to 6502 assembly language and executed them from BASIC using the SYS command. Conversely, when execution was too fast, programmers dropped back to BASIC and polled various addresses in memory (such as $C6 for the C-64, or $D0 for the C-128, denoting the size of the keyboard queue) before resuming execution.

Commodore BASIC keywords could be abbreviated by entering first an unshifted keypress, and then a shifted version of the next keypress. These two characters were then parsed according to a lookup table, and accepted as a substitute for typing the entire command out. However, as BASIC keywords were stored in memory as single byte tokens, this was a convenience for statement entry rather than an optimization.

In the default uppercase-only character set, shifted characters appear as graphics symbols; e.g. the command GOTO could be abbreviated G{Shift-O} (which resembled GΓ onscreen). Most such abbreviations were two letters long, but in some cases they were longer: where a two-letter form would be ambiguous, more unshifted letters of the command were needed, such as GO{Shift-S} (GO♥) being required for GOSUB. Some commands had no abbreviated form, either due to brevity or ambiguity with other commands. For example, the command INPUT had no abbreviation because its spelling collided with the separate INPUT# keyword, which was located nearer to the beginning of the keyword lookup table.
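The effect of table order on abbreviations can be modelled with a first-match search. The Python sketch below is a simplified model, not the ROM routine: KEYWORDS is only an excerpt of the BASIC 2.0 keyword table in table order, and the function name expand is hypothetical. An abbreviation is treated as some unshifted letters followed by one shifted letter, and it resolves to the first keyword that begins with those letters.

```python
# Excerpt of the BASIC 2.0 keyword table, in table order (not the full table).
KEYWORDS = [
    "END", "FOR", "NEXT", "DATA", "INPUT#", "INPUT", "DIM", "READ",
    "LET", "GOTO", "RUN", "IF", "RESTORE", "GOSUB", "RETURN", "REM",
    "STOP", "ON", "WAIT", "LOAD", "SAVE", "VERIFY", "DEF", "POKE",
    "PRINT#", "PRINT",
]


def expand(unshifted, shifted):
    """Return the first keyword starting with unshifted + shifted, or None."""
    typed = unshifted + shifted
    for keyword in KEYWORDS:
        if keyword.startswith(typed):
            return keyword  # first match in table order wins
    return None


print(expand("G", "O"))   # GOTO comes before GOSUB in the table
print(expand("GO", "S"))  # the extra letter disambiguates GOSUB
print(expand("I", "N"))   # INPUT# shadows INPUT
```

In this model every prefix of INPUT is also a prefix of INPUT#, which sits earlier in the table, so no abbreviation can ever select INPUT.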

By abbreviating keywords, it was possible to view more code than would otherwise be possible on a single line (line lengths were usually limited to 2 or 4 screen lines, depending on the specific machine). This allowed for a slight saving on the overhead to store otherwise necessary extra program lines, but nothing more. All BASIC commands were tokenized and took up 1 byte (or two, in the case of several commands of BASIC 7 or BASIC 10) in memory no matter which way they were entered.

In the rare situation when commercial BASIC software was meant to be LISTed, each token's keyword was spelled out in full, which could produce a line extending over more screen lines than the Logical Line Link Table could handle. Even where programmers intended users to edit their software, such lines could be daunting to edit with the on-screen editor. LISTing these long lines near the bottom of the screen on early Commodore 64s could trigger the "push-wrap-crash" bug in the 40-column screen editor, causing the machine to crash or return an OUT OF MEMORY error.

Commodore BASIC lines did not need any spaces except where omitting one would be ambiguous, and many Commodore BASIC programs were written with no spaces, e.g., 100IFA=5THENPRINT"YES":GOTO160. Omitting spaces produced a more compact program, because the tokenizer never removes spaces inserted between keywords: each space is stored as an extra 0x20 byte in the tokenized program and is merely skipped during execution.
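The cost of spaces can be illustrated with a toy tokenizer. The Python sketch below is a simplified model rather than the ROM tokenizer: TOKENS lists only the handful of BASIC 2.0 tokens needed for the example line, the function name tokenize is hypothetical, and the two-byte line number and link pointer that precede each stored line are left out.

```python
# A few BASIC 2.0 token values; the real table is much longer.
TOKENS = {"IF": 0x8B, "THEN": 0xA7, "PRINT": 0x99, "GOTO": 0x89, "=": 0xB2}


def tokenize(text):
    """Replace keywords outside string literals with one-byte tokens.

    Everything else, including every space, is copied through unchanged.
    """
    out = bytearray()
    i = 0
    in_quotes = False
    while i < len(text):
        ch = text[i]
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes:
            for keyword, token in TOKENS.items():
                if text.startswith(keyword, i):
                    out.append(token)
                    i += len(keyword)
                    break
            else:
                out.append(ord(ch))  # not a keyword: store the character
                i += 1
            continue
        out.append(ord(ch))  # quote marks and quoted text are stored as typed
        i += 1
    return bytes(out)


compact = tokenize('IFA=5THENPRINT"YES":GOTO160')
spaced = tokenize('IF A=5 THEN PRINT "YES": GOTO 160')
print(len(compact), len(spaced))
```

Each keyword shrinks to one byte in both versions, so the only difference in size is the six spaces, which are stored verbatim.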

The order of execution of Commodore BASIC lines was not determined by line numbering; instead, it followed the order in which the lines were linked in memory.[3] Much like a modern singly linked list, each program line was stored in memory as a pointer, a line number, and then the tokenized code for the line. The pointer contained the address in memory of the next program line. While a program was being entered, BASIC would constantly reorder program lines in memory so that the line numbers and pointers were all in ascending order. However, after a program was entered, manually altering the line numbers and pointers with the POKE command could allow for out-of-order execution or even give each line the same line number. In the early days, when software written in BASIC was available commercially, this was a software protection technique used to discourage casual modification of the program.
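The linked layout can be modelled in a few lines of Python. This is a sketch under stated assumptions: each stored line is taken to be a two-byte little-endian pointer to the next line, a two-byte line number, the tokenized text, and a zero terminator, with the program starting at $0801 as on a stock C64. The helper names build_program, execution_order and poke are hypothetical.

```python
BASE = 0x0801  # assumed start of BASIC program memory (stock C64)


def build_program(lines):
    """Lay out (line_number, tokenized_body) pairs as a linked list in memory."""
    mem = bytearray()
    for number, body in lines:
        next_addr = BASE + len(mem) + 4 + len(body) + 1
        mem += next_addr.to_bytes(2, "little")  # pointer to the next line
        mem += number.to_bytes(2, "little")     # line number
        mem += body + b"\x00"                   # tokenized text, terminator
    mem += b"\x00\x00"                          # null pointer ends the program
    return mem


def execution_order(mem):
    """Follow the pointers and report line numbers in the order visited."""
    order = []
    addr = BASE
    while True:
        off = addr - BASE
        next_addr = int.from_bytes(mem[off:off + 2], "little")
        if next_addr == 0:
            break
        order.append(int.from_bytes(mem[off + 2:off + 4], "little"))
        addr = next_addr
    return order


def poke(mem, addr, value):
    """Model of POKE: overwrite one byte at an absolute address."""
    mem[addr - BASE] = value


program = build_program([(10, b'\x99"A"'), (20, b'\x99"B"'), (30, b"\x80")])
before = execution_order(program)
second = int.from_bytes(program[0:2], "little")  # address of the second line
poke(program, second + 2, 5)  # low byte of its line number
poke(program, second + 3, 0)  # high byte of its line number
after = execution_order(program)
print(before, after)
```

After the two pokes the second line carries the number 5 but is still visited second, because the traversal follows the pointers and never consults the numbers.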

Variable names were only significant to 2 characters; thus the variable names VARIABLE1, VARIABLE2 and VA all referred to the same variable.

The native number format of Commodore BASIC, like that of its parent MS BASIC, was floating point. Most contemporary BASIC implementations used one byte for the characteristic (exponent) and three bytes for the mantissa. This led to problems in business applications, since the accuracy of a floating point number using a three-byte mantissa is only about 6.5 decimal digits and round-off error is common. Commodore, however, used MS BASIC's four-byte mantissa, which made its BASIC much better suited to business use than most other BASICs of the era.
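The practical difference between the two mantissa sizes can be sketched by rounding a value to a given number of significant binary digits. The Python example below is an illustration only: it treats the three-byte and four-byte mantissas as 24 and 32 significant bits, uses round-to-nearest, and does not reproduce how the actual ROM routines round or store numbers. The function name round_to_mantissa is hypothetical.

```python
import math


def round_to_mantissa(x, bits):
    """Round x to the nearest value representable with a `bits`-bit mantissa."""
    if x == 0:
        return 0.0
    m, e = math.frexp(x)            # x == m * 2**e with 0.5 <= |m| < 1
    scaled = round(m * 2 ** bits)   # keep `bits` significant binary digits
    return math.ldexp(scaled, e - bits)


amount = 123456.78  # e.g. a currency amount with cents
err24 = abs(round_to_mantissa(amount, 24) - amount)
err32 = abs(round_to_mantissa(amount, 32) - amount)
print(err24, err32)
```

With 24 bits the stored amount is already off by more than a tenth of a cent, an error that accumulates over repeated arithmetic, while the 32-bit version stays accurate to well below that.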

Also akin to MS BASIC, 16-bit signed integers (i.e. in the range -32768 to 32767) were available by postfixing a variable name with a percent symbol, and string variables were represented by postfixing the variable name with a dollar sign. Despite the 2 character limit on variable names, the variables AA$, AA, and AA% would each be understood as distinct.
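The naming rules above (two significant characters, with the type suffix distinguishing otherwise identical names) can be summarised as a lookup key. The Python sketch below is a model of the rule only, not of how the interpreter stores variables; the function name variable_key is hypothetical.

```python
def variable_key(name):
    """Return (significant characters, type suffix) for a variable name."""
    suffix = name[-1] if name[-1] in "%$" else ""  # % integer, $ string
    base = name[:-1] if suffix else name
    return base[:2], suffix                        # only two characters count


print(variable_key("VARIABLE1"), variable_key("VARIABLE2"), variable_key("VA"))
print(variable_key("AA$"), variable_key("AA"), variable_key("AA%"))
```

Names that map to the same key refer to the same variable, so VARIABLE1 and VARIABLE2 collide while AA$, AA and AA% remain distinct.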

Many BASIC extensions were released for the Commodore 64, due to the relatively limited capabilities of its native BASIC 2.0. One of the most popular extensions was the DOS Wedge, due to its inclusion on the Commodore 1541 Test/Demo Disk. This 1 KB extension to BASIC added a number of disk-related commands, including the ability to read a disk directory without destroying the program in memory. Its features were subsequently incorporated in various third-party extensions, such as the popular Epyx FastLoad cartridge. Other BASIC extensions, such as Simons' BASIC, added keywords to make it easier to code sprites, sound, and high-resolution graphics.

From a modern programming point of view, the earlier versions of Commodore BASIC presented a host of traps for the programmer. As most of these issues derived from Microsoft BASIC, virtually every home computer BASIC of the era suffered from similar deficiencies.[4] Line numbering meant that, with poor planning, inserting lines in a program often required restructuring the whole program (later BASIC versions included DELETE and RENUMBER commands, as well as an AUTO command that would automatically insert line numbers based on a selected increment). In addition, all variables were treated as global variables. Clearly defined loops were hard to create, often causing the programmer to rely on the GOTO command (this was later rectified in BASIC 3.5 with the addition of the DO, LOOP, WHILE, UNTIL, and EXIT commands). Flag variables often needed to be created to perform certain tasks. Furthermore, the 80-character line limit in earlier versions of Commodore BASIC often meant splitting tasks up into multiple routines, frequently resulting in spaghetti code. Earlier BASICs from Commodore also lacked debugging commands, meaning that bugs and unused variables were hard to trap.

Versions and features

A list of CBM BASIC versions in chronological order, with successively added features:

Released versions

Unreleased versions

Notable extension packages

References

Notes
BASIC 2.0
  • Angerhausen et al. (1983). The Anatomy of the Commodore 64 (for the full reference, see the C64 article).
BASIC 3.5
  • Gerrard, Peter; Bergin, Kevin (1985). The Complete COMMODORE 16 ROM Disassembly. Gerald Duckworth & Co. Ltd. ISBN 0-7156-2004-5.
BASIC 7.0
  • Jarvis, Dennis; Springer, Jim D. (1987). BASIC 7.0 Internals. Grand Rapids, Michigan: Abacus Software, Inc. ISBN 0-916439-71-2.
BASIC 10.0
  • Commodore 65 preliminary documentation (March 1991), with addendum for ROM version 910501. c65manual.txt